Global Multi-label Feature Selection Driven by Higher-Order Correlation and Dual Redundancy
DENG Wen1, SHE Yanhong2, ZHENG Wenli2, HE Xiaoli2, QIAN Ting2
1. School of Computer Science, Xi'an Shiyou University, Xi'an 710065; 2. School of Science, Xi'an Shiyou University, Xi'an 710065 |
Abstract Multi-label feature selection is a critical preprocessing technique for handling high-dimensional multi-label data. However, existing approaches are often trapped in local optima by greedy search strategies, or they measure feature correlation and redundancy inadequately within sparse models. To address these issues, a global multi-label feature selection algorithm driven by higher-order correlation and dual redundancy (GHC-DR) is proposed. First, a fuzzy dependency measure based on multi-label k-nearest neighbors is introduced to accurately evaluate the higher-order correlations between features and the label system. Second, GHC-DR attends to the local geometric structure of features: a feature graph is constructed to capture local similarities among features, and a dual redundancy evaluation mechanism that fuses information theory with this local structure is developed. Finally, higher-order correlation, dual redundancy, and label correlations are integrated into a unified sparse learning objective function, and an efficient closed-form solution is derived. Experiments on 15 public multi-label benchmark datasets demonstrate the superior performance of GHC-DR across multiple evaluation metrics.
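To make the idea of a global, closed-form selection step concrete, the following is a minimal sketch and not the GHC-DR algorithm itself. The abstract does not give the objective function, so everything here is an assumption for illustration: the function name `rank_features`, the parameters `alpha` and `lam`, the use of squared Pearson correlation as a stand-in for the fuzzy dependency measure, and a Gaussian similarity between feature columns as a stand-in for the feature graph. It only shows how a relevance vector and a fused redundancy matrix can be combined in a regularized quadratic objective whose minimizer is obtained by one linear solve instead of a greedy search.

```python
import numpy as np


def rank_features(X, Y, alpha=0.5, lam=1.0):
    """Hypothetical stand-in for a global relevance/redundancy trade-off.

    Minimizes 0.5 * w^T (R + lam * I) w - r^T w, whose closed-form
    minimizer is w = (R + lam * I)^{-1} r. This is NOT the objective
    from the paper; r and R below are simple placeholder measures.
    """
    n, d = X.shape
    # Standardize columns so inner products become correlations.
    Xs = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    Ys = (Y - Y.mean(axis=0)) / (Y.std(axis=0) + 1e-12)

    # Relevance (placeholder): mean squared feature-label correlation.
    r = ((Xs.T @ Ys / n) ** 2).mean(axis=1)

    # Redundancy term 1 (placeholder): squared feature-feature correlation.
    C = (Xs.T @ Xs / n) ** 2

    # Redundancy term 2 (placeholder): Gaussian similarity between
    # feature columns, standing in for a feature-graph similarity.
    sq = np.sum(Xs ** 2, axis=0)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (Xs.T @ Xs), 0.0)
    S = np.exp(-D2 / (2.0 * n))  # bandwidth choice is an assumption

    # Fuse the two redundancy views; both terms are positive semidefinite,
    # so R + lam * I is positive definite and the solve is well posed.
    R = alpha * C + (1.0 - alpha) * S
    w = np.linalg.solve(R + lam * np.eye(d), r)

    # Larger weight means higher rank.
    return np.argsort(-w), w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    f0 = rng.normal(size=200)
    X = np.column_stack([
        f0,                                  # informative feature
        f0 + 0.01 * rng.normal(size=200),    # near-duplicate of feature 0
        rng.normal(size=200),                # pure noise
    ])
    Y = (f0 > 0).astype(float).reshape(-1, 1)
    order, w = rank_features(X, Y)
    print("ranking:", order, "weights:", w)
```

In this toy setup the two near-duplicate features share their weight because the redundancy matrix couples them, while the noise feature receives the lowest score. A greedy method would instead commit to one feature at a time, which is the behavior the abstract contrasts against.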
Received: 23 December 2025
Fund: National Natural Science Foundation of China (No. 12471442)
Corresponding Author: SHE Yanhong, Ph.D., professor. His research interests include machine learning and uncertainty data modeling.
About the authors: DENG Wen, master's student. His research interests include fuzzy rough sets and feature selection. ZHENG Wenli, Ph.D., lecturer. Her research interests include machine learning and hierarchical classification. HE Xiaoli, Ph.D., associate professor. Her research interests include uncertainty reasoning and granular computing. QIAN Ting, Ph.D., associate professor. Her research interests include rough sets, concept lattices and uncertainty reasoning.